2.4. Random Survival Forests

Concept

Random survival forests (RSF) are an ensemble method for analysing right-censored data, first introduced by Ishwaran et al. (2008). RSF has several advantages over Cox regression: (i) unlike Cox regression, RSF does not rely on the proportional hazards assumption; (ii) RSF accounts for nonlinear effects and for interactions between variables.

Usage

A random survival forests analysis can be conducted by applying the following steps:

  1. Select Random Survival Forests as the analysis method from the Analysis tab.
  2. Select suitable variables for the analysis, such as the survival time, the status variable, the category value for the status variable, and the categorical and continuous predictors for the model.
  3. In advanced options, interaction terms, strata terms and time-dependent covariates can be added to the model. If there are multiple records per observation, specify this by ticking the Multiple ID checkbox. The RSF options allow the following to be adjusted: number of trees, bootstrap method, number of randomly selected variables, minimum number of cases in a terminal node, maximum depth of a tree, splitting rule, number of splits, handling of missing values, number of iterations of the missing data algorithm, proximity of cases, bootstrap size and bootstrap type.
  4. Click the Run button to run the analysis.

Outputs

a. Individual Survival Predictions

Survival predictions for each observation can be obtained. In this table, rows represent observations whereas columns represent time endpoints.

b. Individual Survival Predictions OOB

Out of bag (OOB) survival predictions for each observation can be obtained. In this table, rows represent observations whereas columns represent time endpoints.
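To illustrate what distinguishes the OOB table from the table in (a): an OOB prediction for an observation averages only over the trees whose bootstrap sample did not contain that observation. The following is a minimal sketch in plain Python, assuming per-tree survival predictions are already available. The names `tree_preds`, `in_bag` and `oob_ensemble` are hypothetical, and the numbers are made up; this is not the tool's internal code.

```python
from typing import List, Optional


def oob_ensemble(tree_preds: List[List[List[float]]],
                 in_bag: List[List[bool]]) -> List[Optional[List[float]]]:
    """Average per-tree survival curves over the trees where each case is OOB.

    tree_preds[b][i][t]: survival probability from tree b for observation i
                         at time endpoint t.
    in_bag[b][i]:        True if observation i was drawn into the bootstrap
                         sample used to grow tree b.
    Returns one row per observation (columns are time endpoints); a row is
    None when the observation was in-bag for every tree.
    """
    n_trees = len(tree_preds)
    n_obs = len(tree_preds[0])
    n_times = len(tree_preds[0][0])
    table: List[Optional[List[float]]] = []
    for i in range(n_obs):
        # Trees that never saw observation i during training.
        oob_trees = [b for b in range(n_trees) if not in_bag[b][i]]
        if not oob_trees:
            table.append(None)
            continue
        row = [sum(tree_preds[b][i][t] for b in oob_trees) / len(oob_trees)
               for t in range(n_times)]
        table.append(row)
    return table


# Toy input: 3 trees, 2 observations, 2 time endpoints (made-up numbers).
tree_preds = [
    [[0.9, 0.7], [0.8, 0.5]],
    [[0.7, 0.5], [0.6, 0.3]],
    [[0.8, 0.6], [0.7, 0.4]],
]
in_bag = [
    [True, False],
    [False, False],
    [False, True],
]
oob_table = oob_ensemble(tree_preds, in_bag)
```

In this toy input, observation 0 is OOB for the second and third trees only, so its row is the average of those two curves. The same averaging idea applies to OOB cumulative hazard predictions.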


c. Individual Cumulative Hazard Predictions

Cumulative hazard predictions for each observation can be obtained. In this table, rows represent observations whereas columns represent time endpoints.


d. Individual Cumulative Hazard Predictions OOB

Out of bag (OOB) cumulative hazard predictions for each observation can be obtained. In this table, rows represent observations whereas columns represent time endpoints.


e. Error Rate

An error rate table, which shows the estimated error rate for each tree, can be obtained.
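In the RSF literature the error rate is commonly reported as 1 minus Harrell's concordance index, computed from OOB predictions. Whether this tool uses exactly that definition is an assumption here. The sketch below shows a simplified version of the calculation in plain Python: pairs with tied survival times are skipped, and `risk` stands for any predicted risk score where higher means worse prognosis. The function name and the data are hypothetical.

```python
from typing import Sequence


def concordance_error(time: Sequence[float], event: Sequence[int],
                      risk: Sequence[float]) -> float:
    """Return 1 - C, where C is a simplified Harrell's concordance index.

    A pair (i, j) is usable when case i has the strictly shorter time and
    experienced the event (event[i] == 1). The pair is concordant when case i
    also has the higher predicted risk; ties in predicted risk count as half.
    """
    concordant = 0.0
    usable = 0
    n = len(time)
    for i in range(n):
        for j in range(n):
            if time[i] < time[j] and event[i] == 1:
                usable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    if usable == 0:
        raise ValueError("no usable pairs")
    return 1.0 - concordant / usable


# Made-up data: 4 cases, the third one is censored (event = 0).
time = [2, 4, 6, 8]
event = [1, 1, 0, 1]
risk = [0.9, 0.4, 0.6, 0.1]
error = concordance_error(time, event, risk)
```

Under this definition, an error rate of 0.5 corresponds to random guessing and 0 to perfect ranking of the cases.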


f. Feature Selection

Feature selection can be performed based on the variable importance measure. The feature selection process is as follows:

  1. Fit an RSF model to the data and rank all available features (genes) according to the variable importance measure.
  2. Iteratively refit the RSF model (without recalculating variable importance), and at each iteration remove a proportion of features from the bottom of the feature importance ranking list (default is 20%).
  3. Calculate the OOB error rate.
  4. Repeat Steps 2 and 3 until the dataset contains only 2 features.
  5. Select the smallest set of features whose OOB error rate is within 1 standard error of the minimum error rate.
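
The elimination loop described above can be sketched as follows. This is a minimal illustration in plain Python, not the tool's implementation: `backward_select` and `toy_oob_error` are hypothetical names, the callback stands in for refitting an RSF model and is assumed to return both an OOB error rate and its standard error, and the error values in the toy table are made up.

```python
from typing import Callable, List, Sequence, Tuple


def backward_select(ranked: Sequence[str],
                    oob_error: Callable[[Sequence[str]], Tuple[float, float]],
                    drop_fraction: float = 0.2,
                    min_features: int = 2) -> List[str]:
    """Backward elimination over a fixed importance ranking.

    ranked:    features ordered from most to least important (Step 1).
    oob_error: hypothetical callback that refits the model on the given
               features and returns (OOB error rate, standard error).
    """
    current = list(ranked)
    history: List[Tuple[List[str], float, float]] = []
    while True:
        err, se = oob_error(current)              # Step 3
        history.append((list(current), err, se))
        if len(current) <= min_features:          # Step 4 stopping rule
            break
        # Step 2: drop a proportion of features from the bottom of the ranking.
        n_drop = max(1, int(len(current) * drop_fraction))
        current = current[:max(min_features, len(current) - n_drop)]
    # Step 5: smallest subset whose error is within 1 SE of the minimum error.
    _, best_err, best_se = min(history, key=lambda h: h[1])
    eligible = [feats for feats, err, _ in history if err <= best_err + best_se]
    return min(eligible, key=len)


def toy_oob_error(features: Sequence[str]) -> Tuple[float, float]:
    # Made-up error rates keyed by subset size; stands in for an RSF refit.
    table = {10: 0.25, 8: 0.23, 7: 0.22, 6: 0.21,
             5: 0.20, 4: 0.205, 3: 0.208, 2: 0.30}
    return table[len(features)], 0.01


ranked = ["f%d" % k for k in range(1, 11)]   # f1 is the most important
selected = backward_select(ranked, toy_oob_error)
```

In this toy run the minimum error occurs with 5 features, but the 3-feature subset is still within 1 standard error of that minimum, so the smaller subset is returned.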

An interactive plot can be created for selected features.


g. Overall Survival Plot

A survival plot can be created based on the Nelson-Aalen estimator and the overall ensemble predictions.
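
For reference, the Nelson-Aalen estimator accumulates, at each distinct event time, the number of events divided by the number of cases still at risk. The sketch below computes it in plain Python and converts the cumulative hazard to a survival curve via exp(-H(t)). That conversion is an assumption about how such a curve might be drawn, not a statement about the tool's internals, and the function name and data are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def nelson_aalen(time: Sequence[float],
                 event: Sequence[int]) -> List[Tuple[float, float, float]]:
    """Return (t, H(t), exp(-H(t))) at each distinct event time t.

    time:  observed follow-up time of each case.
    event: 1 if the event occurred at that time, 0 if the case was censored.
    """
    curve = []
    cum_hazard = 0.0
    event_times = sorted({t for t, e in zip(time, event) if e == 1})
    for t in event_times:
        at_risk = sum(1 for ti in time if ti >= t)
        deaths = sum(1 for ti, ei in zip(time, event) if ti == t and ei == 1)
        cum_hazard += deaths / at_risk
        curve.append((t, cum_hazard, math.exp(-cum_hazard)))
    return curve


# Made-up data: 5 cases, two of them censored.
time = [1, 2, 2, 3, 4]
event = [1, 1, 0, 1, 0]
curve = nelson_aalen(time, event)
```

Each tuple in `curve` gives one step of the estimated curve; censored cases contribute to the at-risk counts but never add to the hazard.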

h. Individual Random Survival Plot

A survival plot can be drawn for the survival predictions from the random survival forests model. Each line represents the survival curve of one observation.


i. Individual Survival OOB Plot

A survival plot can be drawn for the OOB survival predictions from the random survival forests model. Each line represents the survival curve of one observation.


j. Individual Cumulative Hazard Plot

A cumulative hazard plot can be drawn for the cumulative hazard predictions from the random survival forests model. Each line represents the cumulative hazard curve of one observation.


k. Individual Cumulative Hazard OOB Plot

A cumulative hazard plot can be drawn for the OOB cumulative hazard predictions from the random survival forests model. Each line represents the cumulative hazard curve of one observation.


l. Error Rate Plot

An interactive error rate plot, which shows how the error rate changes as the number of trees increases, can be drawn.


m. Cox vs RSF

A Cox model can be compared with the random survival forests model through an interactive plot that allows visual inspection of both models.
